Abstract. Adaptive (variable-length) codes associate variable-length codewords to symbols being
encoded depending on the previous symbols in the input data string. This class of codes has been 
presented in [9, 10, 11] as a new class of non-standard variable-length codes. Generalized adaptive
codes (GA codes, for short) have also been presented in [10, 11], not only as a new class of non-standard
variable-length codes, but also as a natural generalization of adaptive codes of any order.
This paper is intended to continue developing the theory of variable-length codes by establishing 
several interesting connections between adaptive codes and other classes of codes. The connections 
are discussed not only from a theoretical point of view (by proving new results), but also from an 
applicative one (by proposing several applications). First, we prove that adaptive Huffman encodings 
and Lempel-Ziv encodings are particular cases of encodings by GA codes. Second, we show that any 
(n, 1, m) convolutional code satisfying certain conditions can be modelled as an adaptive code of
order m. Third, we describe a cryptographic scheme based on the connection between adaptive codes 
and convolutional codes, and present an insightful analysis of this scheme. Finally, we conclude by 
generalizing adaptive codes to (p, q)-adaptive codes, and discussing connections between adaptive
codes and time-varying codes. 



Keywords: adaptive codes, convolutional codes, error-correcting codes, generalized adaptive codes, 
1. Introduction

The theory of variable-length codes, one of the most studied areas of coding theory, continues to play 
an important role not only in the evolution of formal languages, but also in some applicative areas of 
computer science such as data compression. The aim of this paper is to continue developing and enriching 
this theory with new results, along with showing their effectiveness in concrete applications. 
Specifically, we continue our study on adaptive codes, which have been recently presented in [9, 10, 11]
as a new class of non-standard variable-length codes. Intuitively, an adaptive code of order n associates 
a codeword to the symbol being encoded depending on the previous n symbols in the input data string. 
Generalized adaptive codes (GA codes, for short) have also been presented in [10, 11], not only as a new
class of non-standard variable-length codes, but also as a natural generalization of adaptive codes of any 
order. 

Both classes are described in detail in section 2. Then, we show that adaptive Huffman encodings 
and Lempel-Ziv encodings are particular cases of encodings by GA codes (sections 3 and 4). In section 
5, we show that any (n, 1, m) convolutional code satisfying a certain condition can be modelled as an
adaptive code of order m. This result is exploited further in section 6, where an efficient cryptographic 
scheme based on convolutional codes is described. An insightful analysis of this cryptographic scheme 
is provided in the same section. In sections 7 and 8, we extend adaptive codes to (p, q)-adaptive codes, 
and present a new class of variable-length codes, called adaptive time-varying codes. 

In the remainder of this introductory section, we recall some basic notions and notations used 
throughout the paper. We denote by $|S|$ the cardinality of the set $S$; if $x$ is a string of finite length,
then $|x|$ denotes the length of $x$. The empty string is denoted by $\lambda$.

For an alphabet $\Sigma$, we denote by $\Sigma^*$ the set $\bigcup_{n=0}^{\infty} \Sigma^n$ and by $\Sigma^+$ the set $\bigcup_{n=1}^{\infty} \Sigma^n$, where $\Sigma^0$ is the set
$\{\lambda\}$. Also, we denote by $\Sigma^{\leq n}$ the set $\bigcup_{i=0}^{n} \Sigma^i$ and by $\Sigma^{\geq n}$ the set $\bigcup_{i=n}^{\infty} \Sigma^i$. Let us consider an alphabet
$\Delta$, $X$ a finite and nonempty subset of $\Delta^+$, and $w \in \Delta^+$. A decomposition of $w$ over $X$ is any sequence
of strings $u_1, u_2, \ldots, u_h$ with $u_i \in X$ for all $i$, $1 \leq i \leq h$, such that $w = u_1 u_2 \ldots u_h$. A code over $\Delta$
is any nonempty set $C \subseteq \Delta^+$ such that each string $w \in \Delta^+$ has at most one decomposition over $C$. A
prefix code over $\Delta$ is any code $C$ over $\Delta$ such that no string in $C$ is a proper prefix of another string in $C$.

If A is an algorithm and x its input, then we denote by A(x) its output. Also, we denote by N the set 
of natural numbers, and by N* the set of nonzero natural numbers. 

Finally, let us fix some useful notations which will be used in the description of the algorithms. 
Let $U = (u_1, u_2, \ldots, u_k)$ be a $k$-tuple. We denote by $U.i$ the $i$-th component of $U$, that is, $U.i = u_i$
for all $i \in \{1, 2, \ldots, k\}$. The 0-tuple is denoted by $()$. The length of a tuple $U$ is denoted by
$Len(U)$. If $V = (v_1, v_2, \ldots, v_b)$, $\mathcal{M} = (m_1, m_2, \ldots, m_r, U)$, $\mathcal{N} = (n_1, n_2, \ldots, n_s, V)$, and $\mathcal{P} =
(p_1, \ldots, p_{i-1}, p_i, p_{i+1}, \ldots, p_t)$ are tuples, and $q$ is an element or a tuple, then we define $\mathcal{P} \triangleleft q$, $\mathcal{P} \triangleright i$,
$U \triangle V$, and $\mathcal{M} \diamond \mathcal{N}$ by:

• $\mathcal{P} \triangleleft q = (p_1, \ldots, p_t, q)$,

• $\mathcal{P} \triangleright i = (p_1, \ldots, p_{i-1}, p_{i+1}, \ldots, p_t)$,

• $U \triangle V = (u_1, u_2, \ldots, u_k, v_1, v_2, \ldots, v_b)$,

• $\mathcal{M} \diamond \mathcal{N} = (m_1 + n_1, m_2 + 1, \ldots, m_r + 1, n_2 + 1, \ldots, n_s + 1, U \triangle V)$,
where $m_1, m_2, \ldots, m_r, n_1, n_2, \ldots, n_s$ are integers.
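
The four tuple operations just defined can be sketched in a few lines of Python. This is only an illustration: the function names `append`, `remove_at`, `concat`, and `combine` are hypothetical labels chosen here, and tuples are assumed to be ordinary Python tuples with 1-based positions mapped onto 0-based indexes.

```python
def append(p, q):
    """Append q (an element or a tuple) as a new last component of p."""
    return p + (q,)

def remove_at(p, i):
    """Remove the i-th component of p (i is 1-based, as in the text)."""
    return p[:i - 1] + p[i:]

def concat(u, v):
    """Concatenate the tuples u and v."""
    return u + v

def combine(m, n):
    """Combine M = (m_1, ..., m_r, U) and N = (n_1, ..., n_s, V).

    The result is (m_1 + n_1, m_2 + 1, ..., m_r + 1, n_2 + 1, ..., n_s + 1, U ++ V).
    """
    *ms, u = m   # integer components of M, then its trailing tuple U
    *ns, v = n   # integer components of N, then its trailing tuple V
    head = (ms[0] + ns[0],)
    rest = tuple(x + 1 for x in ms[1:]) + tuple(x + 1 for x in ns[1:])
    return head + rest + (concat(u, v),)

print(remove_at((1, 2, 3), 2))                      # (1, 3)
print(combine((1, 2, ("a",)), (3, 4, ("b",))))      # (4, 3, 5, ('a', 'b'))
```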

2. Adaptive codes and GA codes 

The aim of this section is to briefly review some basic definitions, results, and notations related to
adaptive codes and generalized adaptive codes [10, 11].
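Definition 2.1. Let $\Sigma$ and $\Delta$ be two alphabets. A function $c : \Sigma \times \Sigma^{\leq n} \to \Delta^+$, $n \geq 1$, is called an adaptive
code of order $n$ if its unique homomorphic extension $\bar{c} : \Sigma^* \to \Delta^*$, given by:

• $\bar{c}(\lambda) = \lambda$,

• $\bar{c}(\sigma_1 \sigma_2 \ldots \sigma_m) = c(\sigma_1, \lambda)\, c(\sigma_2, \sigma_1) \ldots c(\sigma_{n-1}, \sigma_1 \sigma_2 \ldots \sigma_{n-2})\, c(\sigma_n, \sigma_1 \sigma_2 \ldots \sigma_{n-1})\, c(\sigma_{n+1}, \sigma_1 \sigma_2 \ldots \sigma_n)\, c(\sigma_{n+2}, \sigma_2 \sigma_3 \ldots \sigma_{n+1})\, c(\sigma_{n+3}, \sigma_3 \sigma_4 \ldots \sigma_{n+2}) \ldots c(\sigma_m, \sigma_{m-n} \sigma_{m-n+1} \ldots \sigma_{m-1})$

for all strings $\sigma_1 \sigma_2 \ldots \sigma_m \in \Sigma^+$, is injective.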

As it is clearly specified in the definition above, an adaptive code of order n associates a variable-length 
codeword to the symbol being encoded depending on the previous n symbols in the input data string. Let 
us take an example in order to better understand this mechanism. 

Example 2.1. Let $\Sigma = \{a, b\}$ and $\Delta = \{0, 1\}$ be two alphabets, and $c : \Sigma \times \Sigma^{\leq 1} \to \Delta^+$ a function
given as in the table below. One can verify that $\bar{c}$ is injective, and according to Definition 2.1 it follows
that $c$ is an adaptive code of order one.



Table 1. An adaptive code of order one.

| $\Sigma \backslash \Sigma^{\leq 1}$ | a | b | $\lambda$ |
|---|---|---|---|
| a | 0 | 1 | 00 |
| b | 10 | 00 | 11 |

Let $x = abaa \in \Sigma^+$ be an input data string. Using the definition above, we encode $x$ by

$\bar{c}(x) = c(a, \lambda)\, c(b, a)\, c(a, b)\, c(a, a) = 001010$.
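
The encoding mechanism of Definition 2.1 can be sketched as follows. This is a minimal illustration, not part of the original presentation: the name `adaptive_encode` and the dictionary layout are assumptions made here, and the entry for c(a, a) is taken to be 0, which is the value forced by the encoding of abaa in Example 2.1.

```python
def adaptive_encode(code, n, s):
    """Encode s with an adaptive code of order n.

    `code` maps (symbol, context) to a codeword, where the context is the
    string of at most n symbols preceding the symbol ("" is the empty string).
    """
    out = []
    for i, sym in enumerate(s):
        context = s[max(0, i - n):i]   # the previous n symbols (fewer near the start)
        out.append(code[(sym, context)])
    return "".join(out)

# Table 1 (Example 2.1): an adaptive code of order one over {a, b}.
TABLE_1 = {
    ("a", "a"): "0",  ("a", "b"): "1",  ("a", ""): "00",
    ("b", "a"): "10", ("b", "b"): "00", ("b", ""): "11",
}

print(adaptive_encode(TABLE_1, 1, "abaa"))  # 001010
```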

Example 2.2. Let us consider $\Sigma = \{a, b, c\}$ and $\Delta = \{0, 1\}$ two alphabets, and $c : \Sigma \times \Sigma^{\leq 2} \to \Delta^+$
a function given as in the following table. One can easily verify that $\bar{c}$ is injective, and according to
Definition 2.1, $c$ is an adaptive code of order two.



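Table 2. An adaptive code of order two.

| $\Sigma \backslash \Sigma^{\leq 2}$ | a | b | c | aa | ab | ac | ba | bb | bc | ca | cb | cc | $\lambda$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| a | 0 | 11 | 10 | 00 | 1 | 10 | 01 | 10 | 11 | 11 | 11 | 0 | 00 |
| b | 10 | 000 | 11 | 11 | 01 | 00 | 00 | 11 | 01 | 101 | 00 | 10 | 11 |
| c | 111 | 01 | 00 | 10 | 00 | 11 | 11 | 00 | 00 | 0 | 10 | 11 | 10 |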

Let $x = abacca \in \Sigma^+$ be an input data string. Using the definition above, we encode $x$ by
$\bar{c}(x) = c(a, \lambda)\, c(b, a)\, c(a, ab)\, c(c, ba)\, c(c, ac)\, c(a, cc) = 0010111110$.
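Let $c : \Sigma \times \Sigma^{\leq n} \to \Delta^+$ be an adaptive code of order $n$, $n \geq 1$. We denote by $C_{c, \sigma_1 \sigma_2 \ldots \sigma_h}$ the set
$\{c(\sigma, \sigma_1 \sigma_2 \ldots \sigma_h) \mid \sigma \in \Sigma\}$, for all $\sigma_1 \sigma_2 \ldots \sigma_h \in \Sigma^{\leq n} - \{\lambda\}$, and by $C_{c, \lambda}$ the set $\{c(\sigma, \lambda) \mid \sigma \in \Sigma\}$.
We write $C_{\sigma_1 \sigma_2 \ldots \sigma_h}$ instead of $C_{c, \sigma_1 \sigma_2 \ldots \sigma_h}$, and $C_\lambda$ instead of $C_{c, \lambda}$ whenever there is no confusion. Let
us denote by $AC(\Sigma, \Delta, n)$ the set

$\{c : \Sigma \times \Sigma^{\leq n} \to \Delta^+ \mid c \text{ is an adaptive code of order } n\}$.

Theorem 2.1. Let $\Sigma$ and $\Delta$ be two alphabets, and $c : \Sigma \times \Sigma^{\leq n} \to \Delta^+$ a function, $n \geq 1$. If $C_u$ is a prefix
code, for all $u \in \Sigma^{\leq n}$, then $c \in AC(\Sigma, \Delta, n)$.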

Proof: 

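Let us assume that $C_{\sigma_1 \sigma_2 \ldots \sigma_h}$ is a prefix code, for all $\sigma_1 \sigma_2 \ldots \sigma_h \in \Sigma^{\leq n}$, but $c \notin AC(\Sigma, \Delta, n)$. By
Definition 2.1, the unique homomorphic extension of $c$, denoted by $\bar{c}$, is not injective. This implies that

$\exists\, u\sigma u', u\sigma' u'' \in \Sigma^+$, with $\sigma, \sigma' \in \Sigma$ and $u, u', u'' \in \Sigma^*$, such that $\sigma \neq \sigma'$ and

$$\bar{c}(u \sigma u') = \bar{c}(u \sigma' u''). \quad (1)$$

We can rewrite the equality (1) by

$$\bar{c}(u)\, c(\sigma, P_n(u))\, \bar{c}(u') = \bar{c}(u)\, c(\sigma', P_n(u))\, \bar{c}(u''), \quad (2)$$

where the function $P_n(\cdot)$ is given as below.

$$P_n(u) = \begin{cases} \lambda & \text{if } u = \lambda, \\ u_1 \ldots u_q & \text{if } u = u_1 u_2 \ldots u_q,\ u_1, u_2, \ldots, u_q \in \Sigma, \text{ and } q \leq n, \\ u_{q-n+1} \ldots u_q & \text{if } u = u_1 u_2 \ldots u_q,\ u_1, u_2, \ldots, u_q \in \Sigma, \text{ and } q > n. \end{cases}$$

By hypothesis, $C_{P_n(u)}$ is a prefix code and $c(\sigma, P_n(u)), c(\sigma', P_n(u)) \in C_{P_n(u)}$.

Therefore, the set $\{c(\sigma, P_n(u)), c(\sigma', P_n(u))\}$ is a prefix code. But the equality (2) can hold only if one of
these two codewords is a prefix of the other, that is, only if $\{c(\sigma, P_n(u)), c(\sigma', P_n(u))\}$ is not a prefix set.
Thus, our assumption leads to a contradiction. □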
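
The sufficient condition of Theorem 2.1 is easy to test mechanically. The sketch below is an illustration under stated assumptions: the names `is_prefix_free` and `satisfies_theorem_2_1` are hypothetical, codes are stored as dictionaries keyed by (symbol, context), and the table used is the order-one code of Example 2.1 with c(a, a) taken to be 0.

```python
from itertools import product

def is_prefix_free(words):
    """True if the words are pairwise distinct and none is a prefix of another."""
    words = list(words)
    if len(set(words)) != len(words):
        return False
    return not any(a != b and b.startswith(a) for a in words for b in words)

def satisfies_theorem_2_1(code, alphabet, n):
    """Check that C_u is a prefix code for every context u of length at most n."""
    contexts = [""] + ["".join(t) for k in range(1, n + 1)
                       for t in product(alphabet, repeat=k)]
    return all(is_prefix_free(code[(s, u)] for s in alphabet) for u in contexts)

# The order-one code of Example 2.1.
TABLE_1 = {
    ("a", "a"): "0",  ("a", "b"): "1",  ("a", ""): "00",
    ("b", "a"): "10", ("b", "b"): "00", ("b", ""): "11",
}

print(satisfies_theorem_2_1(TABLE_1, "ab", 1))  # True
```

Note that the theorem gives a sufficient condition only: a function that fails this check may still be an adaptive code.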

Definition 2.2. Let $F : \mathbb{N}^* \times \Sigma^+ \to \Sigma^*$ be a function, where $\mathbb{N}^*$ denotes the set $\mathbb{N} - \{0\}$. A function
$c_F : \Sigma \times \Sigma^* \to \Delta^+$ is called a generalized adaptive code (GA code, for short) if its unique homomorphic
extension $\bar{c}_F : \Sigma^* \to \Delta^*$, given by:

• $\bar{c}_F(\lambda) = \lambda$,

• $\bar{c}_F(\sigma_1 \sigma_2 \ldots \sigma_m) = c_F(\sigma_1, F(1, \sigma_1 \sigma_2 \ldots \sigma_m)) \ldots c_F(\sigma_m, F(m, \sigma_1 \sigma_2 \ldots \sigma_m))$

for all strings $\sigma_1 \sigma_2 \ldots \sigma_m \in \Sigma^+$, is injective.

Remark 2.1. The function $F$ in Definition 2.2 is called the adaptive function corresponding to the GA
code $c_F$. Clearly, a GA code $c_F$ can be constructed if its adaptive function $F$ is already constructed.

Remark 2.2. Let $\Sigma$ and $\Delta$ be two alphabets. We denote by $GAC(\Sigma, \Delta)$ the set

$\{c_F : \Sigma \times \Sigma^* \to \Delta^+ \mid c_F \text{ is a GA code}\}$.
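The following theorem proves that adaptive codes (of any order) are special cases of GA codes.

Theorem 2.2. Let $\Sigma$ and $\Delta$ be alphabets. Then, $AC(\Sigma, \Delta, n) \subseteq GAC(\Sigma, \Delta)$ for all $n \geq 1$.

Proof:

Let $c_F \in AC(\Sigma, \Delta, n)$ be an adaptive code of order $n$, $n \geq 1$, and $F : \mathbb{N}^* \times \Sigma^+ \to \Sigma^*$ a function given
by:

$$F(i, \sigma_1 \sigma_2 \ldots \sigma_m) = \begin{cases} \lambda & \text{if } i = 1 \text{ or } i > m, \\ \sigma_1 \sigma_2 \ldots \sigma_{i-1} & \text{if } 2 \leq i \leq m \text{ and } i \leq n + 1, \\ \sigma_{i-n} \sigma_{i-n+1} \ldots \sigma_{i-1} & \text{if } 2 \leq i \leq m \text{ and } i > n + 1, \end{cases}$$

for all $i \geq 1$ and $\sigma_1 \sigma_2 \ldots \sigma_m \in \Sigma^+$. One can verify that $|F(i, \sigma_1 \sigma_2 \ldots \sigma_m)| \leq n$, for all $i \geq 1$ and
$\sigma_1 \sigma_2 \ldots \sigma_m \in \Sigma^+$. According to Definition 2.1, the function $\bar{c}_F$ is given by:

• $\bar{c}_F(\lambda) = \lambda$,

• $\bar{c}_F(\sigma_1 \sigma_2 \ldots \sigma_m) = c_F(\sigma_1, \lambda)\, c_F(\sigma_2, \sigma_1) \ldots c_F(\sigma_{n-1}, \sigma_1 \sigma_2 \ldots \sigma_{n-2})\, c_F(\sigma_n, \sigma_1 \sigma_2 \ldots \sigma_{n-1})\, c_F(\sigma_{n+1}, \sigma_1 \sigma_2 \ldots \sigma_n)\, c_F(\sigma_{n+2}, \sigma_2 \sigma_3 \ldots \sigma_{n+1})\, c_F(\sigma_{n+3}, \sigma_3 \sigma_4 \ldots \sigma_{n+2}) \ldots c_F(\sigma_m, \sigma_{m-n} \sigma_{m-n+1} \ldots \sigma_{m-1})$

for all strings $\sigma_1 \sigma_2 \ldots \sigma_m \in \Sigma^+$. It is easy to remark that

$\bar{c}_F(\sigma_1 \sigma_2 \ldots \sigma_m) = c_F(\sigma_1, F(1, \sigma_1 \sigma_2 \ldots \sigma_m)) \ldots c_F(\sigma_m, F(m, \sigma_1 \sigma_2 \ldots \sigma_m))$
for all $\sigma_1 \sigma_2 \ldots \sigma_m \in \Sigma^+$, which proves the theorem. □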

The adaptive mechanism in Definition 2.2 can be illustrated by the figure below. More precisely, the
figure captures the idea behind this mechanism: the codeword associated to the current symbol depends 
on the symbol itself and a sequence of symbols chosen by the adaptive function. 





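[Figure 1: block diagram. The adaptive function $F$ and the symbols $\sigma_1, \ldots, \sigma_m$ feed the ENCODER, which outputs $c_F(\sigma_i, F(i, \sigma_1 \ldots \sigma_m))$.]

Figure 1. Encoding with a GA code.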



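Example 2.3. Let $\Sigma$ and $\Delta$ be two alphabets, $c_F : \Sigma \times \Sigma^* \to \Delta^+$ a GA code, and $F : \mathbb{N}^* \times \Sigma^+ \to \Sigma^*$
its adaptive function. Let us consider $F$ given as below.

$$F(i, \sigma_1 \sigma_2 \ldots \sigma_m) = \begin{cases} \lambda & \text{if } i = 1 \text{ or } i > m, \\ \sigma_{i-1} & \text{if } 2 \leq i \leq m. \end{cases}$$

One can trivially verify that the function $c_F$ is also an adaptive code of order one.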
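
To make the role of the adaptive function concrete, the following sketch implements the homomorphic extension of Definition 2.2 and instantiates it with the function F of Example 2.3. The names `ga_encode` and `F_order_one` are hypothetical, and the codeword table is borrowed from Example 2.1 (with c(a, a) taken to be 0) purely for illustration.

```python
def ga_encode(c_F, F, s):
    """Concatenate c_F(s_i, F(i, s)) for i = 1, ..., len(s)."""
    return "".join(c_F(sym, F(i, s)) for i, sym in enumerate(s, start=1))

def F_order_one(i, s):
    """Adaptive function of Example 2.3: the previous symbol, or the empty string."""
    if i == 1 or i > len(s):
        return ""
    return s[i - 2]          # s is 0-indexed, so s[i - 2] is the (i-1)-th symbol

# Codewords of the order-one adaptive code in Example 2.1.
TABLE_1 = {
    ("a", "a"): "0",  ("a", "b"): "1",  ("a", ""): "00",
    ("b", "a"): "10", ("b", "b"): "00", ("b", ""): "11",
}

def c_F(sym, context):
    return TABLE_1[(sym, context)]

print(ga_encode(c_F, F_order_one, "abaa"))  # 001010
```

With this choice of F, the GA encoding coincides with the order-one adaptive encoding of Example 2.1, as Theorem 2.2 predicts.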
3. GA codes and adaptive Huffman codes

In this section, we prove that adaptive Huffman encodings are particular cases of encodings by GA codes. 
This result can be exploited further in data compression to develop efficient compression algorithms; for 
example, the algorithms presented in [11] combine adaptive codes with Huffman's classical algorithm.

The well-known Huffman algorithm is a two-pass encoding scheme, that is, the input must be read 
twice. The version used in practice is called the adaptive Huffman algorithm, which reads the input only 
once. Intuitively, the encoding of an input data string using the adaptive Huffman algorithm requires the 
construction of a sequence of Huffman trees. 

Let $\Sigma$ be an alphabet, and $w = w_1 w_2 \ldots w_h$ a string over $\Sigma$. Denote by $T_0(w), T_1(w), \ldots, T_h(w)$
the sequence of Huffman trees constructed by the adaptive Huffman algorithm for the input string $w$.
The Huffman tree $T_0(w)$ is associated to the alphabet $\Sigma$ (with the assumption that each symbol in $\Sigma$ has
frequency 1). For all $i \in \{1, 2, \ldots, h\}$, the Huffman tree $T_i(w)$ (associated to the string $w_1 w_2 \ldots w_i$) is
obtained by updating the tree $T_{i-1}(w)$.

The procedure by which this update takes place is called the sibling transformation, which can be described
as follows. Let $T_i(w)$ be the current tree and $k$ the frequency of $w_{i+1}$; the tree $T_{i+1}(w)$ is obtained from
$T_i(w)$ by applying the following algorithm: compare $w_{i+1}$ with its successors in the tree (from left to
right and from bottom to top). If the immediate successor has frequency $k + 1$ or greater, then we do
not have to change anything. Otherwise, $w_{i+1}$ should be swapped with the last successor which has
frequency $k$ or smaller (only if this successor is not its parent). The frequency of $w_{i+1}$ is incremented
from $k$ to $k + 1$. If $w_{i+1}$ is the root of the tree, then the loop terminates. Otherwise, it continues with the
parent of $w_{i+1}$ (for further details on Huffman trees and the adaptive Huffman algorithm, the reader is
referred to [8]).

The codeword associated to the symbol $\sigma$ in the Huffman tree $T_i(w)$ is denoted by $code(\sigma, T_i(w))$,
for all $i \in \{0, 1, \ldots, h\}$.

Theorem 3.1. Adaptive Huffman encodings are particular cases of encodings by GA codes. 
Proof: 

Let $\Sigma$ and $\Delta$ be two alphabets, $w$ a string over $\Sigma$, and $F : \mathbb{N}^* \times \Sigma^+ \to \Sigma^*$, $c_F : \Sigma \times \Sigma^* \to \Delta^+$ two
functions. Let us consider the function $F$ given by:

$$F(i, \sigma_1 \sigma_2 \ldots \sigma_m) = \begin{cases} \lambda & \text{if } i = 1 \text{ or } i > m, \\ \sigma_1 \sigma_2 \ldots \sigma_{i-1} & \text{otherwise,} \end{cases}$$

and the function $c_F$ by $c_F(\sigma, u) = code(\sigma, T_{|u|}(u))$, for all $(\sigma, u) \in \Sigma \times \Sigma^*$. Let us assume that $\bar{c}_F$ is
not injective, that is, $\exists\, u\sigma v, u\sigma' v' \in \Sigma^+$ such that $\sigma, \sigma' \in \Sigma$, $\sigma \neq \sigma'$ and $\bar{c}_F(u\sigma v) = \bar{c}_F(u\sigma' v')$. The
previous equality can be rewritten by:

$$\bar{c}_F(u)\, code(\sigma, T_{|u|}(u))\, \bar{c}_F(v) = \bar{c}_F(u)\, code(\sigma', T_{|u|}(u))\, \bar{c}_F(v'). \quad (3)$$

Due to the prefix property of the set $\{code(\sigma, T_{|u|}(u)), code(\sigma', T_{|u|}(u))\}$, the equality (3) cannot hold
true, which leads to the conclusion that our assumption is false. Thus, we conclude that $c_F$ is a GA code,
which proves the theorem. □

Remark 3.1. If $u$ is a prefix of $w$, then $T_i(u) = T_i(w)$, for all $i \leq |u|$.
Example 3.1. Let $\Sigma = \{a, b, c, d\}$ be an alphabet, and $w = bcabd \in \Sigma^+$. Applying the adaptive
Huffman algorithm to the input string $w$, we get the following Huffman trees.

[Figure 2: six panels, (a) $T_0(w)$, (b) $T_1(w)$, (c) $T_2(w)$, (d) $T_3(w)$, (e) $T_4(w)$, (f) $T_5(w)$.]

Figure 2. The Huffman trees associated to $w$: $T_0(w)$, $T_1(w)$, $T_2(w)$, $T_3(w)$, $T_4(w)$, and $T_5(w)$.

Let $F : \mathbb{N}^* \times \Sigma^+ \to \Sigma^*$, $c_F : \Sigma \times \Sigma^* \to \{0, 1\}^+$ be constructed as above. Then, we encode $w$
by: $\bar{c}_F(bcabd) = c_F(b, \lambda)\, c_F(c, b)\, c_F(a, bc)\, c_F(b, bca)\, c_F(d, bcab) = code(b, T_0(\lambda))\, code(c, T_1(b))\,
code(a, T_2(bc))\, code(b, T_3(bca))\, code(d, T_4(bcab)) = 0110001100$.
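
The construction in the proof of Theorem 3.1 can be imitated in code. The sketch below is a simplification made here, not the algorithm of the paper: instead of updating the tree with the sibling transformation, it rebuilds a Huffman code from the symbol frequencies of the prefix u at every step. Because ties between equal weights may be broken differently, the individual codewords (and hence the bitstring for bcabd) may differ from those in Example 3.1; what the sketch does preserve is the structure c_F(σ, u) = code(σ, T_{|u|}(u)) and the unique decodability. All function names are hypothetical, and the alphabet is assumed to contain at least two symbols.

```python
import heapq
from collections import Counter

def huffman_code(freq):
    """Build a binary Huffman code {symbol: codeword} from {symbol: frequency}."""
    # Heap items are (weight, tie_breaker, {symbol: partial codeword}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]

def F(i, s):
    """Adaptive function of Theorem 3.1: the prefix s_1 ... s_{i-1}."""
    return "" if i == 1 or i > len(s) else s[:i - 1]

def make_c_F(alphabet):
    def c_F(sym, u):
        freq = Counter({a: 1 for a in alphabet})  # every symbol starts with frequency 1
        freq.update(u)                            # add the occurrences seen in u
        return huffman_code(freq)[sym]
    return c_F

def encode(alphabet, w):
    c_F = make_c_F(alphabet)
    return "".join(c_F(sym, F(i, w)) for i, sym in enumerate(w, start=1))

def decode(alphabet, bits):
    c_F = make_c_F(alphabet)
    out, pos = "", 0
    while pos < len(bits):
        table = {c_F(a, out): a for a in alphabet}  # prefix code for the current context
        end = pos + 1
        while bits[pos:end] not in table:
            if end >= len(bits):
                raise ValueError("invalid bitstring")
            end += 1
        out += table[bits[pos:end]]
        pos = end
    return out

encoded = encode("abcd", "bcabd")
print(decode("abcd", encoded))  # bcabd
```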



4. GA codes and Lempel-Ziv codes 

The aim of this section is to prove that Lempel-Ziv encodings are particular cases of encodings by GA
codes. Let $\Sigma$ and $\Delta$ be two alphabets such that $\{0, 1, \ldots, 9\} \cap \Sigma = \emptyset$. First, we recall the Lempel-Ziv
parsing procedure of an input data string $w$, where $w = w_1 w_2 \ldots w_n$ is a string over $\Sigma$. For more details,
the reader is referred to [16, 17].

The first variable-length block arising from the Lempel-Ziv parsing of the data string $w$ is $w_1$. The
second block in the parsing is the shortest prefix of $w_2 \ldots w_n$ which is not equal to $w_1$. Consider that this
second block is $w_2 \ldots w_j$. Then, the third block will be the shortest prefix of $w_{j+1} \ldots w_n$ which is not
equal to either $w_1$ or $w_2 \ldots w_j$. Suppose the Lempel-Ziv parsing has produced the first $k$ variable-length
blocks $B_1, B_2, \ldots, B_k$ in the parsing, and $w^{(k)}$ is that part left of $w$ after $B_1, B_2, \ldots, B_k$ have been
removed. Then, the next block $B_{k+1}$ in the parsing is the shortest prefix of $w^{(k)}$ which is not equal to any
of the preceding blocks $B_1, B_2, \ldots, B_k$ (if there is no such block, then $B_{k+1} = w^{(k)}$ and the Lempel-Ziv
parsing procedure terminates).
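
The parsing procedure described above can be sketched as follows. The function name `lz_parse` is a hypothetical label; the sketch covers only the parsing into blocks, not the assignment of codewords to blocks.

```python
def lz_parse(w):
    """Split w into Lempel-Ziv blocks: each block is the shortest prefix of the
    remaining input that differs from every earlier block."""
    blocks, seen, start = [], set(), 0
    while start < len(w):
        end = start + 1
        # Extend the candidate block while it repeats an earlier block.
        while end < len(w) and w[start:end] in seen:
            end += 1
        block = w[start:end]   # if no new prefix exists, this is the whole remainder
        blocks.append(block)
        seen.add(block)
        start = end
    return blocks

print(lz_parse("bcc7ba"))  # ['b', 'c', 'c7', 'ba']
```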

Theorem 4.1. Lempel-Ziv encodings are particular cases of encodings by GA codes. 
Proof: 

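Let $\Sigma_1 = \Sigma \cup \{0, 1, \ldots, 9\}$ be an alphabet, $\sigma_f \in \Sigma$ a fixed symbol, and let $F : \mathbb{N}^* \times \Sigma_1^+ \to \Sigma_1^*$,
$c_F : \Sigma_1 \times \Sigma_1^* \to \{0, 1\}^*$ be two functions.

Let us consider $F$ given by $F(i, \sigma_1 \sigma_2 \ldots \sigma_m) = i_1 i_2 \ldots i_q \sigma_f \sigma_1 \sigma_2 \ldots \sigma_m$, for all $i \in \mathbb{N}^*$ and
$\sigma_1 \sigma_2 \ldots \sigma_m \in \Sigma_1^+$, where $i_1, i_2, \ldots, i_q \in \{0, 1, \ldots, 9\}$ are the digits corresponding to $i$ (from left to
right).

Let $u = u_1 u_2 \ldots u_p$ be a string over $\Sigma_1$, that is, $u_i \in \Sigma_1$ for all $i \in \{1, 2, \ldots, p\}$. Consider the
following notations.

• $fixed(u) = \begin{cases} 1 & \text{if } p \geq 3 \text{ and } \exists\, i \in \{2, 3, \ldots, p - 1\} \text{ such that } u_i = \sigma_f \text{ and } u_j \in \{0, 1, \ldots, 9\} \text{ for all } j \in \{1, 2, \ldots, i - 1\}, \\ 0 & \text{otherwise.} \end{cases}$

• $left(u) = \begin{cases} u_1 u_2 \ldots u_r & \text{if } fixed(u) = 1,\ u_i \in \{0, 1, \ldots, 9\} \text{ for all } i \in \{1, 2, \ldots, r\}, \text{ and } u_{r+1} = \sigma_f, \\ \lambda & \text{otherwise.} \end{cases}$

• $right(u) = \begin{cases} v & \text{if } fixed(u) = 1 \text{ and } u = left(u)\, \sigma_f\, v, \\ \lambda & \text{otherwise.} \end{cases}$

• $goodpos(u) = \begin{cases} 1 & \text{if } fixed(u) = 1 \text{ and } |left(u)| + 2 \leq left(u) \leq |u|, \\ 0 & \text{otherwise.} \end{cases}$

Let us consider $c_F$ given by

$$c_F(\sigma, \sigma_1 \sigma_2 \ldots \sigma_m) = \begin{cases} LZ(\sigma, \sigma_1 \sigma_2 \ldots \sigma_m) & \text{if } fixed(\sigma_1 \sigma_2 \ldots \sigma_m) = 1,\ goodpos(\sigma_1 \sigma_2 \ldots \sigma_m) = 1, \text{ and } \sigma = \sigma_{left(\sigma_1 \sigma_2 \ldots \sigma_m)}, \\ \lambda & \text{otherwise,} \end{cases}$$

where $LZ(\sigma, \sigma_1 \sigma_2 \ldots \sigma_m)$ is defined as follows: let $B_1, B_2, \ldots, B_t$ be the blocks arising from the
Lempel-Ziv parsing of the string $right(\sigma_1 \sigma_2 \ldots \sigma_m)$, and

$$B_z = \sigma_{|left(\sigma_1 \sigma_2 \ldots \sigma_m)| + 2 + j_1} \ldots \sigma_{|left(\sigma_1 \sigma_2 \ldots \sigma_m)| + 2 + j_2}, \quad (4)$$

where $z \in \{1, \ldots, t\}$, $0 \leq j_1 \leq j_2 \leq |right(\sigma_1 \sigma_2 \ldots \sigma_m)| - 1$, and

$$|left(\sigma_1 \sigma_2 \ldots \sigma_m)| + 2 + j_1 \leq left(\sigma_1 \sigma_2 \ldots \sigma_m) \leq |left(\sigma_1 \sigma_2 \ldots \sigma_m)| + 2 + j_2. \quad (5)$$

If $left(\sigma_1 \sigma_2 \ldots \sigma_m) = |left(\sigma_1 \sigma_2 \ldots \sigma_m)| + 2 + j_2$, then let $LZ(\sigma, \sigma_1 \sigma_2 \ldots \sigma_m)$ be the codeword
associated by the Lempel-Ziv data compression algorithm to the block $B_z$. Otherwise, we consider that
$LZ(\sigma, \sigma_1 \sigma_2 \ldots \sigma_m) = \lambda$. One can easily verify that $\bar{c}_F(\sigma_1 \sigma_2 \ldots \sigma_m)$ is the encoding of $\sigma_1 \sigma_2 \ldots \sigma_m$ by
the Lempel-Ziv data compression algorithm, for all $\sigma_1 \sigma_2 \ldots \sigma_m \in \Sigma_1^+$. Thus, we have obtained that $\bar{c}_F$
is injective, which proves the theorem. □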

Example 4.1. Let $\Sigma = \{a, b, c\}$, $\Sigma_1 = \Sigma \cup \{0, 1, \ldots, 9\}$ be two alphabets, and let $F : \mathbb{N}^* \times \Sigma_1^+ \to \Sigma_1^*$,
$c_F : \Sigma_1 \times \Sigma_1^* \to \{0, 1\}^*$ be two functions given as in Theorem 4.1 (considering $\sigma_f = a$). Also, let
$w = bcc7ba \in \Sigma_1^+$ be an input string.

Applying the Lempel-Ziv parsing procedure to the input string $w$, we get the following blocks:
$B_1 = b$, $B_2 = c$, $B_3 = c7$, and $B_4 = ba$. Let us denote by $codeLZ(B_i)$ the codeword associated
by the Lempel-Ziv encoder to the block $B_i$, for all $i \in \{1, 2, 3, 4\}$. One can verify that we get the
following results:

• $codeLZ(B_1) = 1011$,

• $codeLZ(B_2) = 01100$,

• $codeLZ(B_3) = 100001$,

• $codeLZ(B_4) = 010111$.

Finally, we encode $w = bcc7ba$ by the GA code $c_F$ as shown below.

$$\begin{aligned} \bar{c}_F(w) &= c_F(b, F(1, bcc7ba))\, c_F(c, F(2, bcc7ba))\, c_F(c, F(3, bcc7ba))\, c_F(7, F(4, bcc7ba))\, c_F(b, F(5, bcc7ba))\, c_F(a, F(6, bcc7ba)) \\ &= c_F(b, 1abcc7ba)\, c_F(c, 2abcc7ba)\, c_F(c, 3abcc7ba)\, c_F(7, 4abcc7ba)\, c_F(b, 5abcc7ba)\, c_F(a, 6abcc7ba) \\ &= LZ(b, 1abcc7ba) \cdot LZ(c, 2abcc7ba) \cdot \lambda \cdot LZ(7, 4abcc7ba) \cdot \lambda \cdot LZ(a, 6abcc7ba) \\ &= codeLZ(B_1)\, codeLZ(B_2)\, codeLZ(B_3)\, codeLZ(B_4) \\ &= 101101100100001010111. \end{aligned}$$



5. Adaptive codes and convolutional codes 

Convolutional codes are one of the most widely used channel codes in practical communication 
systems. These codes are developed with a separate strong mathematical structure and are primarily 
used for real time error correction. Convolutional codes convert the entire data stream into one single 
codeword: the encoded bits depend not only on the current k input bits, but also on past input bits. 
The same strategy is used by adaptive variable-length codes. The aim of this section is to discuss the 
connection between adaptive codes and convolutional codes. Specifically, we show how a convolutional 
code can be modelled as an adaptive code. Before stating the results, let us first present a brief description 
of convolutional codes. 

Convolutional codes are commonly specified by three parameters: n, k, and m, where 

• n is the number of output bits, 

• k is the number of input bits, 

• and m is the number of memory registers. 

The quantity km is called the constraint length, and represents the number of bits in the encoder memory 
that affect the generation of the n output bits. Also, the quantity k/n is called the code rate, and is a 
measure of the efficiency of the code. A convolutional code with parameters n, k, m is usually referred 
to as an (n, k, m) convolutional code. For an (n, k, m) convolutional code, the encoding procedure is 
entirely defined by n generator polynomials. Usually, these generator polynomials are represented as 
binary (m + 1)-tuples. Also, throughout this section, we consider only (n, 1, m) convolutional codes.

Let us consider an (n, 1, m) convolutional code with $P_1, P_2, \ldots, P_n$ being its generator polynomials,
and let $x = x_1 x_2 \ldots x_t \in \{0, 1\}^+$ be an input data string. The string $x$ is encoded by $y = y_1 y_2 \ldots y_{nt}$,
where the substring $y_{in+1} \ldots y_{in+n}$ encodes the input bit $x_{i+1}$, for all $i \in \{0, 1, \ldots, t - 1\}$. Precisely, if
$i \in \{0, 1, \ldots, t - 1\}$ and $j \in \{1, 2, \ldots, n\}$, then

$$y_{in+j} = w_{i - i_1^j + 1} \oplus w_{i - i_2^j + 1} \oplus \cdots \oplus w_{i - i_{q(j)}^j + 1},$$

where $\{i_1^j, i_2^j, \ldots, i_{q(j)}^j\} = \{z \in \{1, 2, \ldots, m + 1\} \mid P_j.z = 1\}$, $i_1^j < i_2^j < \cdots < i_{q(j)}^j$, $\oplus$ denotes the
modulo-2 addition, and

$$w_{i-l} = \begin{cases} x_{i-l+1} & \text{if } i - l + 1 \geq 1, \\ 0 & \text{otherwise,} \end{cases}$$

for all $l \in \{0, 1, \ldots, m\}$.
Example 5.1. Let us consider a (2, 1, 2) convolutional code with $P_1 = (0, 1, 1)$, $P_2 = (1, 0, 1)$ being its
generator polynomials. This convolutional code can be represented graphically as in the figure below.

[Figure 3: block diagram with the input, the memory registers $m_1$ and $m_2$, output 1, and output 2.]

Figure 3. A (2, 1, 2) convolutional code.

Let us now describe the encoding mechanism. Let $b$ be the current input bit being encoded, and let $b_1$ and
$b_2$ be the current bits stored in the memory registers $m_1$ and $m_2$, respectively. Given that $P_1 = (0, 1, 1)$,
the first output bit is obtained by adding (modulo-2) $b_1$ and $b_2$. The second bit is obtained by adding
(modulo-2) $b$ and $b_2$. After both output bits have been obtained, $b$ and $b_1$ become the new values stored
in the memory registers $m_1$ and $m_2$, respectively. For example, if $x = 0101$ is an input bitstring, one can
verify that the output is 00011010 (for each input bit, the output is obtained by concatenating the two
output bits).
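
The encoding mechanism of an (n, 1, m) convolutional code can be sketched as a shift register. The function name `conv_encode` is hypothetical, and the sketch assumes, as in Example 5.1, that all memory registers initially store 0 unless an initial content is supplied.

```python
def conv_encode(polys, bits, initial=None):
    """Encode a bitstring with an (n, 1, m) convolutional code.

    `polys` is a list of n generator polynomials, each a binary (m + 1)-tuple:
    position 1 refers to the current input bit, positions 2 .. m + 1 to the
    memory registers m_1 .. m_m.
    """
    m = len(polys[0]) - 1
    regs = list(initial) if initial is not None else [0] * m
    out = []
    for ch in bits:
        b = int(ch)
        window = [b] + regs                 # (current bit, m_1, ..., m_m)
        for p in polys:
            y = 0
            for tap, value in zip(p, window):
                y ^= tap & value            # modulo-2 sum of the selected positions
            out.append(str(y))
        regs = ([b] + regs)[:m]             # shift: b enters m_1, the last register is dropped
    return "".join(out)

# Example 5.1: P1 = (0, 1, 1), P2 = (1, 0, 1), input 0101.
print(conv_encode([(0, 1, 1), (1, 0, 1)], "0101"))  # 00011010
```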

Theorem 5.1. Any (n, 1, m) convolutional code with $P_1, P_2, \ldots, P_n$ being its generator polynomials,
and satisfying the condition

$$\{z \in \{1, 2, \ldots, n\} \mid P_z.1 = 1\} \neq \emptyset,$$

is an adaptive code of order m.



Proof:

Let $c : \{0, 1\} \times \{0, 1\}^{\leq m} \to \{0, 1\}^n$ be a function. Consider an (n, 1, m) convolutional code with
$P_1, P_2, \ldots, P_n$ being its generator polynomials. Also, let us consider that $c$ is given by:

$$c(x, x_1 x_2 \ldots x_p) = P_1[x x_p x_{p-1} \ldots x_1 z_p]\, P_2[x x_p x_{p-1} \ldots x_1 z_p] \ldots P_n[x x_p x_{p-1} \ldots x_1 z_p],$$

for all $x \in \{0, 1\}$ and $x_1 x_2 \ldots x_p \in \{0, 1\}^{\leq m}$, where

• $z_p = 0^{m-p}$ (the string consisting of $m - p$ zeros),

• and $P_j[b_1 b_2 \ldots b_{m+1}] = b_{i_1^j} \oplus b_{i_2^j} \oplus \cdots \oplus b_{i_{q(j)}^j}$, with $\{i_1^j, i_2^j, \ldots, i_{q(j)}^j\} = \{z \mid P_j.z = 1\}$ and
$i_1^j < i_2^j < \cdots < i_{q(j)}^j$.

Let $b_1 b_2 \ldots b_q \in \{0, 1\}^{\leq m}$. By hypothesis, there exists $j \in \{1, 2, \ldots, n\}$ such that $P_j.1 = 1$. The two
codewords $c(0, b_1 b_2 \ldots b_q)$ and $c(1, b_1 b_2 \ldots b_q)$ have the same length $n$ and differ in their $j$-th bit, so
neither of them is a prefix of the other. This leads to the conclusion that

$$\{c(0, b_1 b_2 \ldots b_q),\ c(1, b_1 b_2 \ldots b_q)\}$$

is a prefix code. Thus, we have obtained that $C_u$ (as defined in section 2) is a prefix code, for all
$u \in \{0, 1\}^{\leq m}$. According to Theorem 2.1, $c$ is an adaptive code of order m. □
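
The correspondence used in the proof can be checked on Example 5.1. The sketch below builds the function c from the generator polynomials and feeds it to a generic order-m adaptive encoder. The names `conv_as_adaptive` and `adaptive_encode` are hypothetical, and the sketch assumes that missing context bits are padded with zeros, which corresponds to memory registers that initially store 0.

```python
def conv_as_adaptive(polys):
    """Return (c, m), where c(x, context) is the adaptive code of order m
    modelling the (n, 1, m) convolutional code given by `polys`."""
    m = len(polys[0]) - 1

    def c(x, context):
        # context = x_1 ... x_p (oldest bit first), with p <= m.
        window = [int(x)] + [int(b) for b in reversed(context)]
        window += [0] * (m + 1 - len(window))      # zero padding up to length m + 1
        return "".join(
            str(sum(tap & value for tap, value in zip(p, window)) % 2)
            for p in polys
        )

    return c, m

def adaptive_encode(c, n, s):
    """Order-n adaptive encoding: each symbol is encoded in the context of
    the (at most) n symbols that precede it."""
    return "".join(c(sym, s[max(0, i - n):i]) for i, sym in enumerate(s))

c, m = conv_as_adaptive([(0, 1, 1), (1, 0, 1)])
print(adaptive_encode(c, m, "0101"))  # 00011010
```

The output agrees with the bitstring obtained in Example 5.1 for the same generator polynomials and input.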
6. A cryptographic scheme based on convolutional codes

The results presented in the previous section lead to an efficient data encryption scheme. Specifically,
every (1, 1, m) convolutional code satisfying the condition in Theorem 5.1 can be used for data encryption
(and decryption), without any additional information. Let us consider a (1, 1, m) convolutional code
with $P$ being its generator polynomial. If $P.1 = 0$ (that is, the condition in Theorem 5.1 is not satisfied),
then the output bits depend only on the bits stored in the memory registers. For example, let $b$ be the
current input bit, and $b_1, b_2, \ldots, b_m$ the bits stored in the memory registers before encoding the bit $b$. The
output bit $b_{out}$ depends, in this case, only on the bits $b_1, b_2, \ldots, b_m$. This makes the decryption procedure
impossible (without any additional information), since the output cannot be uniquely decoded. Thus, we
consider only (1, 1, m) convolutional codes that satisfy the condition given in Theorem 5.1. Also, we
consider that any (1, 1, m) convolutional code is completely specified by

• $P$, its generator polynomial,

• and a binary $m$-tuple $Q$, where $Q.i$ denotes the bit stored initially in the memory register $m_i$, for
all $i \in \{1, 2, \ldots, m\}$.

Public and Private Keys. Let us denote by Public the set of public keys, and by Private the set of 
private keys. There are three parameters in our cryptographic scheme: m, P, and Q. Note that by making 
P and/or Q available to any user, the parameter m is implicitly made available as well (since P consists 
of m + 1 elements, and Q has m elements). Thus, if P and Q are both public keys, then the information 
can be correctly decoded. Except for the case when both P and Q are public keys, all other cases lead 
to a powerful cryptographic scheme. The parameters P and Q shouldn't normally be among the public 
keys, since both P and Q give partial information about the encryption/decryption procedures. Thus, we 
consider that in practice only the parameter m should be included among the public keys. Keeping all 
three parameters as private keys increases the security level as well (by a constant factor). 



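Table 3. Possible ways of partitioning the keys.

| Public | Private | Security level | Complexity |
|---|---|---|---|
| $\emptyset$ | $\{P, Q, m\}$ | High | $O(4^m)$ |
| $\{m\}$ | $\{P, Q\}$ | High | $O(4^m)$ |
| $\{P\}$ | $\{Q\}$ | High | $O(2^m)$ |
| $\{Q\}$ | $\{P\}$ | High | $O(2^m)$ |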

Security and Complexity. There are four ways of partitioning the keys, as shown in the table above. 
Note that if P or Q is a public key, then it doesn't make sense to include m as a public or private key, 
since if P or Q is made available then m is implicitly a public key. Let us discuss each case separately. 

Public = $\emptyset$ and Private = $\{P, Q, m\}$. In this case, an unauthorized user has no information about the
encryption/decryption procedure. A possible attack cannot be more efficient than a naive search,
starting with $m = 1$ and trying all possible cases for $P$ and $Q$. Since there are $2^m$ possible binary
$m$-tuples $Q$ (and, because $P.1 = 1$ is fixed, $2^m$ possible polynomials $P$), we can conclude that the total
number of decoding attempts is at most

$$2^1 \cdot 2^1 + 2^2 \cdot 2^2 + \cdots + 2^m \cdot 2^m = \frac{4^{m+1} - 4}{3} = \frac{2^{2m+2} - 4}{3}.$$

For example, if $m = 100$, then the total number of decoding attempts is at most

$$\frac{2^{202} - 4}{3} \approx 2.1 \cdot 10^{60}.$$

Definitely, the scheme is highly efficient in this case.

Public = $\{m\}$ and Private = $\{P, Q\}$. Even if $m$ is a public key, the efficiency of our scheme is not
affected at all. A possible attack must try all possible cases for $P$ and $Q$ (in the worst case). Thus,
the total number of decoding attempts is at most $2^m \cdot 2^m = 2^{2m}$. For $m = 100$, the total number
of decoding attempts is at most $2^{200} \approx 1.6 \cdot 10^{60}$.

Public = $\{P\}$ and Private = $\{Q\}$. Since only $Q$ is a private key in this case, we can conclude that the
total number of decoding attempts is at most $2^m$. For $m = 100$, $2^{100} \approx 1.3 \cdot 10^{30}$.

Public = $\{Q\}$ and Private = $\{P\}$. The total number of decoding attempts is at most $2^m$, since only $P$
is a private key in this case.
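
The bounds above are straightforward to reproduce numerically. The following sketch only checks the arithmetic of the four cases; the function names are hypothetical.

```python
def attempts_all_private(m):
    """Naive search over every size k = 1 .. m, with 2^k choices for P and 2^k for Q."""
    return sum(2**k * 2**k for k in range(1, m + 1))

def attempts_closed_form(m):
    """Closed form of the sum above: (2^(2m+2) - 4) / 3."""
    return (2**(2 * m + 2) - 4) // 3

m = 100
assert attempts_all_private(m) == attempts_closed_form(m)
print(f"{float(attempts_closed_form(m)):.1e}")  # 2.1e+60  (nothing public)
print(f"{float(2**(2 * m)):.1e}")               # 1.6e+60  (only m public)
print(f"{float(2**m):.1e}")                     # 1.3e+30  (P or Q public)
```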

Encryption and Decryption. A detailed description of the encryption algorithm is provided below. 
Note that m is a positive integer, P is a binary (m + 1)-tuple that satisfies the condition in Theorem 5.1,
and Q is a binary m-tuple. 



    Input : m, P, Q, and x = x_1 x_2 ... x_t ∈ {0, 1}+
    Output: y ∈ {0, 1}+

    S ← ∅; y ← λ
    For i ← 2 to m + 1 do
        If P.i = 1 then
            S ← S ∪ {i − 1}
        Endif
    Endfor
    For i ← 1 to t do
        z ← x_i
        For each j ∈ S do
            z ← z ⊕ Q.j
        Endfor
        y ← y · z; Q ← Q ▷ m; Q ← (x_i) △ Q
    Endfor

Figure 4. Convolutional encryption.

As mentioned in the beginning of this section, the decryption algorithm is based on the equality $P.1 = 1$.
Let $y_i$ be the current bit being decoded, $Q$ the content of the memory registers before decoding $y_i$,
$S = \{i_1, i_2, \ldots, i_j\}$ the set of indexes of those memory registers that contribute to the output bit, and
$z = Q.i_1 \oplus Q.i_2 \oplus \cdots \oplus Q.i_j$. If $y_i = 0$, we can conclude that $x_i = z$ (since $x_i \oplus z = 0$). Otherwise,
if $y_i = 1$, it follows that $x_i = \bar{z}$, where $\bar{z}$ denotes the complement of $z$. A complete description of the
algorithm is given below.



    Input : y = y_1 y_2 ... y_t ∈ {0, 1}+, m, P, and Q
    Output: x ∈ {0, 1}+

    S ← ∅; x ← λ
    For i ← 2 to m + 1 do
        If P.i = 1 then
            S ← S ∪ {i − 1}
        Endif
    Endfor
    For i ← 1 to t do
        z ← 0
        For each j ∈ S do
            z ← z ⊕ Q.j
        Endfor
        If z ≠ y_i then
            z ← 1
        Else
            z ← 0
        Endif
        x ← x · z; Q ← Q ▷ m; Q ← (z) △ Q
    Endfor

Figure 5. Convolutional decryption.
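
The two procedures can be sketched in Python as follows. This is an illustrative transcription under stated assumptions, not a reference implementation: the names `taps`, `encrypt`, and `decrypt` are hypothetical, P and Q are given as tuples of 0/1 integers, and the decoded bit is computed directly as the modulo-2 sum of the ciphertext bit and the selected registers, which matches the case analysis in the text (x_i = z if y_i = 0, and the complement of z otherwise).

```python
def taps(P):
    """1-based indexes of the memory registers selected by P.2 ... P.(m+1)."""
    return [i - 1 for i in range(2, len(P) + 1) if P[i - 1] == 1]

def encrypt(P, Q, x):
    """Convolutional encryption of the bitstring x with polynomial P and initial registers Q."""
    if P[0] != 1:
        raise ValueError("P.1 must be 1, otherwise the output cannot be decoded")
    S, regs, y = taps(P), list(Q), []
    for ch in x:
        bit = int(ch)
        z = bit
        for j in S:
            z ^= regs[j - 1]                  # add (modulo 2) the selected registers
        y.append(str(z))
        regs = ([bit] + regs)[:len(regs)]     # drop the last register, insert x_i in front
    return "".join(y)

def decrypt(P, Q, y):
    """Inverse of encrypt, using the same P and the same initial registers Q."""
    if P[0] != 1:
        raise ValueError("P.1 must be 1, otherwise the output cannot be decoded")
    S, regs, x = taps(P), list(Q), []
    for ch in y:
        z = int(ch)
        for j in S:
            z ^= regs[j - 1]                  # z becomes the original input bit
        x.append(str(z))
        regs = ([z] + regs)[:len(regs)]       # registers evolve exactly as during encryption
    return "".join(x)

# P = (1, 0, 1), Q = (0, 1), x = 001.
cipher = encrypt((1, 0, 1), (0, 1), "001")
print(cipher)                                 # 101
print(decrypt((1, 0, 1), (0, 1), cipher))     # 001
```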



Example 6.1. Consider a (1, 1, 2) convolutional code with $P = (1, 0, 1)$ and $Q = (0, 1)$. Graphically,
this code is represented as in Figure 6. Initially, the memory register $m_1$ stores the bit 0 ($= Q.1$),
and the memory register $m_2$ stores the bit 1 ($= Q.2$). Let $x = 001 \in \{0, 1\}^+$ be an input bitstring. Using
the encryption algorithm, we encode $x$ by $y = 101$. Given that $P.1 = 1$, we can use the convolutional
decryption algorithm to decode $y$ into $x$ (using the private keys $m$, $P$, and $Q$).

[Figure 6: block diagram with the input, the memory registers $m_1$ and $m_2$, and the output.]

Figure 6. A (1, 1, 2) convolutional code.



7. (p, q)-adaptive codes



In order to have more flexibility when developing applications based on adaptive codes, we introduce
a natural generalization of adaptive codes, called (p, q)-adaptive codes. For example, extending the
algorithms presented in [11] to (p, q)-adaptive codes is expected to give better results. Let us give a
formal definition.

Definition 7.1. Let $\Sigma$ and $\Delta$ be alphabets. A function $c : \Sigma^q \times \Sigma^{\leq p} \to \Delta^+$ is called a (p, q)-adaptive
code if its unique homomorphic extension $\bar{c} : \Sigma^* \to \Delta^*$, given by:

• $\bar{c}(\lambda) = \lambda$,

• $\bar{c}(\sigma_1 \sigma_2 \ldots \sigma_m) = c(\sigma_1 \ldots \sigma_q, \lambda)\, c(\sigma_2 \ldots \sigma_{q+1}, \sigma_1) \ldots c(\sigma_{p+1} \ldots \sigma_{p+q}, \sigma_1 \ldots \sigma_p)$
for all strings $\sigma_1 \sigma_2 \ldots \sigma_m \in \Sigma^+$, is injective.

Developing applications based on (p, q)-adaptive codes is not a subject of this paper. The concept is
presented here just to show how much flexibility we get when using various generalizations of adaptive 
codes. Let us give an example. 

Example 7.1. Let $\Sigma = \{a, b\}$, $\Delta = \{0, 1\}$ be two alphabets, and $c : \Sigma^2 \times \Sigma^{\leq 1} \to \Delta^+$ a function
given as in the table below. One can verify that $\bar{c}$ is injective, and according to Definition 7.1, $c$ is a
(1, 2)-adaptive code.



Table 4. A (1, 2)-adaptive code.

$\Sigma^2 \backslash \Sigma^{\le 1}$    a      b      $\lambda$
aa                                             11     00
ab                                      10     101    11
ba                                      111    01     10
bb                                      110    00     01


Let $x = ababa \in \Sigma^+$ be an input data string. Using the definition above, we encode x by
$\bar{c}(x) = c(ab, \lambda)\, c(ba, a)\, c(ab, b)\, c(ba, a) = 11111101111$.
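
The encoding in Example 7.1 can be reproduced with a minimal Python sketch. It assumes that a
(p, q)-adaptive code is applied to consecutive blocks of q symbols, each shifted by one position and
paired with the (at most p) symbols that precede the block, which is the pattern the example follows.
The dictionary TABLE_4 transcribes Table 4 under two assumptions: the fourth row is read as the block
bb, and the entry c(aa, a), which is blank in the table, is left out. The names LAMBDA, TABLE_4 and
encode_pq are illustrative.

```python
LAMBDA = ""  # the empty string, written as lambda in the text

# Table 4 as a mapping (block, context) -> codeword.
# Assumptions: the fourth row is "bb"; c(aa, a) is not available and is omitted.
TABLE_4 = {
    ("aa", "b"): "11", ("aa", LAMBDA): "00",
    ("ab", "a"): "10", ("ab", "b"): "101", ("ab", LAMBDA): "11",
    ("ba", "a"): "111", ("ba", "b"): "01", ("ba", LAMBDA): "10",
    ("bb", "a"): "110", ("bb", "b"): "00", ("bb", LAMBDA): "01",
}


def encode_pq(x, code, p, q):
    """Encode x with a (p, q)-adaptive code given as a dict (sliding-window reading)."""
    if not x:
        return LAMBDA
    if len(x) < q:
        raise ValueError("this sketch needs at least q input symbols")
    out = []
    for start in range(len(x) - q + 1):
        block = x[start:start + q]            # the q symbols being encoded
        context = x[max(0, start - p):start]  # up to p symbols preceding the block
        out.append(code[(block, context)])    # KeyError if the table lacks the entry
    return "".join(out)


if __name__ == "__main__":
    print(encode_pq("ababa", TABLE_4, p=1, q=2))
```

For x = ababa the loop visits the pairs (ab, λ), (ba, a), (ab, b), (ba, a), matching the factorization
given in the example.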

8. Adaptive codes and time-varying codes 

Time-varying codes were recently introduced in [14] as a proper extension of L-codes [3]. Intuitively,
a time-varying code associates a codeword to the symbol being encoded depending on its position
in the input data string. The connection to gsm-codes and SE-codes has also been discussed in [14].
Several characterization results for time-varying codes can be found in [15]. Let us now give a formal
definition.

Definition 8.1. Let $\Sigma$ and $\Delta$ be two alphabets. A function
$c : \Sigma \times \mathbb{N}^* \to \Delta^+$ is called a time-varying code if its unique homomorphic
extension $\bar{c} : \Sigma^* \to \Delta^*$, given by:

• $\bar{c}(\lambda) = \lambda$,

• $\bar{c}(\sigma_1 \sigma_2 \ldots \sigma_m) = c(\sigma_1, 1)\, c(\sigma_2, 2) \ldots c(\sigma_m, m)$
for all strings $\sigma_1 \sigma_2 \ldots \sigma_m \in \Sigma^+$, is injective.

Motivation. This section introduces a new class of variable-length codes, called adaptive
time-varying codes. Combining adaptive codes with time-varying codes can be useful when the input
string consists of substrings with different characteristics. Let $x = u_1 u_2 \ldots u_t \in \Sigma^+$
be an input string, where $u_1, u_2, \ldots, u_t$ are substrings with different characteristics. Instead
of associating a single adaptive code to x, it is desirable to associate an adaptive code to each
substring $u_i$. This technique could be exploited further in data compression to improve the results.
Combining adaptive codes with time-varying codes leads to the following encoding mechanism: the
codeword associated to the current symbol being encoded depends not only on the previous symbols in
the input string, but also on the position of the current symbol in the input string. A formal
definition is given below.

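Definition 8.2. Let $\Sigma$ and $\Delta$ be alphabets. A function
$c : \Sigma \times \Sigma^{\le n} \times \mathbb{N}^* \to \Delta^+$ is called an adaptive
time-varying code of order n if its unique homomorphic extension $\bar{c} : \Sigma^* \to \Delta^*$,
given by:

• $\bar{c}(\lambda) = \lambda$,

• $\bar{c}(\sigma_1 \sigma_2 \ldots \sigma_m) = c(\sigma_1, \lambda, 1)\, c(\sigma_2, \sigma_1, 2) \ldots c(\sigma_{n-1}, \sigma_1 \sigma_2 \ldots \sigma_{n-2}, n-1)\, c(\sigma_n, \sigma_1 \sigma_2 \ldots \sigma_{n-1}, n)\, c(\sigma_{n+1}, \sigma_1 \sigma_2 \ldots \sigma_n, n+1)\, c(\sigma_{n+2}, \sigma_2 \sigma_3 \ldots \sigma_{n+1}, n+2)\, c(\sigma_{n+3}, \sigma_3 \sigma_4 \ldots \sigma_{n+2}, n+3) \ldots c(\sigma_m, \sigma_{m-n} \sigma_{m-n+1} \ldots \sigma_{m-1}, m)$
for all strings $\sigma_1 \sigma_2 \ldots \sigma_m \in \Sigma^+$, is injective.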

Example 8.1. Let $\Sigma = \{a, b\}$, $\Delta = \{0, 1\}$ be two alphabets, and let
$c : \Sigma \times \Sigma^{\le 2} \times \mathbb{N}^* \to \Delta^+$ be a function given by:

$c(\sigma, u, i) = \begin{cases} zero[i] & \text{if } \sigma = a, \\ one[i] & \text{if } \sigma = b, \end{cases}$

for all $(\sigma, u, i) \in \Sigma \times \Sigma^{\le 2} \times \mathbb{N}^*$, where

• $zero[i] = 00 \ldots 0$ (i zeros),

• and $one[i] = 11 \ldots 1$ (i ones).

One can verify that $\bar{c}$ is injective, and according to Definition 8.2 c is an adaptive
time-varying code of order two. For example, the string x = abaa is encoded by

$\bar{c}(x) = c(a, \lambda, 1)\, c(b, a, 2)\, c(a, ab, 3)\, c(a, ba, 4) = 0110000000$.
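
The computation in Example 8.1 can be traced with a short Python sketch of Definition 8.2: the symbol
at position i is encoded together with the (at most n) symbols that precede it and with i itself. The
function names (example_code, encode_atv, decode_example) are illustrative, and the decoder is specific
to this particular code, where the i-th codeword has exactly i bits.

```python
def example_code(sigma, u, i):
    """The code c of Example 8.1; the context u is accepted but not used by this code."""
    if sigma == "a":
        return "0" * i  # zero[i]
    if sigma == "b":
        return "1" * i  # one[i]
    raise ValueError("symbol outside the alphabet {a, b}")


def encode_atv(x, code, n):
    """Homomorphic extension of an adaptive time-varying code of order n."""
    out = []
    for k, sigma in enumerate(x):       # k is the 0-based position
        context = x[max(0, k - n):k]    # up to n previous symbols
        out.append(code(sigma, context, k + 1))
    return "".join(out)


def decode_example(y):
    """Invert encode_atv for example_code only: the i-th block has length i."""
    out, pos, i = [], 0, 1
    while pos < len(y):
        block = y[pos:pos + i]
        if len(block) != i or block not in ("0" * i, "1" * i):
            raise ValueError("not a valid encoding for this code")
        out.append("a" if block[0] == "0" else "b")
        pos += i
        i += 1
    return "".join(out)


if __name__ == "__main__":
    y = encode_atv("abaa", example_code, n=2)
    print(y, decode_example(y))
```

Because the length of each codeword is fixed by its position, the output can be split into blocks
unambiguously, which is one way to see why the extension is injective for this code.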
9. Conclusions and further work

Adaptive codes associate variable-length codewords to symbols being encoded depending on the previous
symbols in the input data string. This class of codes has been presented in [11] as a new class of
non-standard variable-length codes. Generalized adaptive codes (GA codes, for short) have also been
presented in [11], not only as a new class of non-standard variable-length codes, but also as a natural
generalization of adaptive codes of any order.

In this paper, we contributed the following results. First, we proved that adaptive Huffman encodings
and Lempel-Ziv encodings are particular cases of encodings by GA codes (sections 3 and 4). In section
5, we proved that any (n, 1, m) convolutional code satisfying a certain condition can be modelled as an
adaptive code of order m. This result was exploited further in section 6, where an efficient cryptographic
scheme based on convolutional codes is described. An insightful analysis of this cryptographic scheme
was provided in the same section. In sections 7 and 8, we extended adaptive codes to (p, q)-adaptive
codes, and presented a new class of variable-length codes, called adaptive time-varying codes.

Further work in this area is intended to establish new interesting connections between adaptive codes
and other classes of codes, along with showing their effectiveness in concrete applications. Future
directions related to adaptive codes also include the data compression algorithms recently presented
in [11]. For example, combining the extensions described in sections 7 and 8 with the algorithms
presented in [11] may lead to better results.